


ED 023 187 



DOCUMENT RESUME



24 



EA 001 686 



By-Sattler, Jerome M.

Effects of Graduated Cues on Performance on Two Wechsler Subtests. Final Report.

San Diego State Coll., Calif. 

Spons Agency-Office of Education (DHEW), Washington, DC. Bureau of Research.

Bureau No-BR-7-8057
Pub Date Dec 67
Grant-OEG-4-7-078057-0402
Note-201p.

EDRS Price MF-$1.00 HC-$10.15

Descriptors-*Adolescents, Bibliographies, *Cues, Grade 7, Grade 8, Grade 9, Literature Reviews, Performance Factors, Psychological Testing, Psychological Tests, *Test Results, Test Validity

Identifiers-SCAT, Wechsler Bellevue Intelligence Scale Form 1, Wechsler Intelligence Scale for Children, WISC

The effects of alterations in test procedure upon the original and repeated test 
performance of normal adolescents are determined for two subtests, Block Design
(BD) and Picture Arrangement (PA), appearing in the Wechsler Intelligence Scale for
Children and Wechsler Bellevue Intelligence Scale Form 1. Two experiments were 
conducted, one with 170 eighth and ninth grade students and the other with 1 
seventh and eighth grade students. The first experiment used only the BD subtest, 
while the second used both the BD and PA subtests. In both experiments an 
alternative form of the subtest was administered immediately after the first, with help 
given on the first administration only. The results included: (1) Administering help and 
giving cues did not affect test performance in the first experiment, but did affect
that of the second, (2) different examiners do not obtain significantly different test
scores, (3) there is little difference between sexes in the test scores, (4) grades are
poor predictors of the BD and PA subtests, and (5) SCAT scales are highly correlated
with grades. (HW)








FINAL REPORT 
Project No. 7-8057 
Grant No. OEG-4-7-078057-0402



EFFECTS OF GRADUATED CUES ON PERFORMANCE ON 
TWO WECHSLER SUBTESTS 



December 1967 



U. S. DEPARTMENT OF 
HEALTH, EDUCATION, AND WELFARE 

Office of Education 
Bureau of Research 






Jerome M. Sattler 
San Diego State College 
San Diego, California 92115 






The research reported herein was performed pursuant to a 
grant with the Office of Education, U. S. Department of 
Health, Education, and Welfare. Contractors undertaking 
such projects under Government sponsorship are encouraged 
to express freely their professional judgment in the 
conduct of the project. Points of view or opinions 
stated do not, therefore, necessarily represent official 
Office of Education position or policy. 






U.S. DEPARTMENT OF HEALTH, EDUCATION & WELFARE 
OFFICE OF EDUCATION 



THIS DOCUMENT HAS BEEN REPRODUCED EXACTLY AS RECEIVED FROM THE 
PERSON OR ORGANIZATION ORIGINATING IT. POINTS OF VIEW OR OPINIONS 
STATED DO NOT NECESSARILY REPRESENT OFFICIAL OFFICE OF EDUCATION 
POSITION OR POLICY. 



Acknowledgments 



Numerous people have supported the project throughout 
its many phases. At the University of North Dakota, where 
the project originally began, President George Starcher, 
Vice-President William Koenker, and Controller George 
Skogley were extremely generous with their time and their 
encouragement. The interlibrary loan department of the 
University of North Dakota was always helpful in obtaining 
many references. 

At San Diego State College, the Psychology Department, 
the Division of Life Sciences, and the San Diego State 
College Foundation have supported the project in innumer- 
able ways. William Erickson, Foundation Manager, has 
struggled through the problems of budgeting and has always 
shown an understanding for the problems encountered by the 
investigator. Dr. Ernest O'Byrne, Vice-President for
Administration, and his Foundation staff have been a 
pleasure to work with. The interlibrary loan department, 
and in particular, Mildred Le Compte, has been very helpful 
in enabling the investigator to secure numerous publica- 
tions. Her patience was very much appreciated. The 
computer center at the College, too, provided help in the 
data analyses. 

The Sweetwater Union High School District of Southern 
California merits our sincerest thanks for allowing their 
students to participate in the project. Milton Grossman, 
in charge of special education for the District, arranged 
for the facilities. His kindness and support were 
extremely generous. Mar Vista Junior High School in 
Imperial Beach, California, and National City Junior High 
School in National City, California willingly allowed us 
to use their facilities, and made their records available. 
A number of staff members at each school worked with us 
to ensure that the students were of average ability. They 
worked with the community and handled the difficulties 
that occasionally arose. These educators gave generously 
of their time, and their help was most appreciated. Our 
thanks go to Dale Newell, Lois Kruse, George Hester, 

George Prout, Chester Smith, and Martin Hunting from Mar 
Vista Junior High School; and to William Scarborough, Ida 
Harris, Ivin Heathman, Keith Fink, and C. William Veazey 
from National City Junior High School. 

The four examiners, William Kefalas, Larry Rigg,
Rosemary Roth, and Leonard Tozier, played an important
role in the research project. Their professional
competencies in administering the tests, in establishing
rapport with the subjects and school staff, and in
performing the statistical analyses were excellent. Their
help has been most appreciated.

Over four hundred students participated. Their
cooperation was excellent, and it is hoped that their 
participation provided them with a meaningful experience. 








Table of Contents

Acknowledgments
1. Introduction
2. Experiment 1 - Method
3. Experiment 1 - Results
4. Experiment 2 - Method
5. Experiment 2 - Results
6. Discussion
7. Conclusions and Recommendations
8. Summary
References
Appendixes
A. Recording form used in Experiment 1 for WISC-WB order
B. Recording form used in Experiment 1 for WB-WISC order
C. Recording form for incorrect WB BD used in Experiment 1
D. Recording form for incorrect WISC BD used in Experiment 1
E. Table 44, Raw Data for Experiment 1
F. Recording form for WB BD used in Experiment 2
G. Recording form for WISC BD used in Experiment 2
H. Recording form for WB PA used in Experiment 2
I. Recording form for WISC PA used in Experiment 2
J. Recording form for incorrect WB BD used in Experiment 2
K. Recording form for incorrect WISC BD used in Experiment 2
L. Table 45, Raw Data for Experiment 2
M. Grant publication reprint: "Comments on Cieutat's 'Examiner Differences with the Stanford-Binet IQ'"
N. Grant publication reprint: "Statistical Reanalysis of Canady' ... on the I.Q.: A New A ... of Racial Psychology'
ERIC Report Resume

List of Tables

Table
1. Sample for Experiment 1
2. Mean Block Design Scores for Examiners, Conditions, and Administrations
3. Analysis of Variance of Block Design Scores for Examiners, Conditions, and Administrations
4. Mean Block Design Scores for Sex, Conditions, and Administrations
5. Analysis of Variance of Block Design Scores for Conditions, Sex, and Administrations
6. Mean Block Design Scores for Condition by Test Form Orders and Administrations
7. Analysis of Variance of Block Design Scores for Condition by Test Form Orders and Administrations
8. Mean Help Items for Conditions, Test Form Orders, and Examiners
9. Analysis of Variance of Help Items for Test Form Orders, Examiners, and Conditions
10. Mean Ages in Months for Condition by Test Form Orders and Sex
11. Analysis of Variance of Age for Condition by Test Form Orders and Sex
12. Mean SCAT Scores for Examiners, Conditions, and SCAT Scales
13. Analysis of Variance of SCAT Scores for Examiners, Conditions, and SCAT Scales
14. Mean SCAT Scores for Sex, Conditions, and SCAT Scales
15. Analysis of Variance of SCAT Scores for Conditions, Sex, and SCAT Scales
16. Intercorrelation Matrix for Variables Common to All Subjects in Experiment 1
17. Correlations of Other Subject Areas with Variables Common to All Subjects in Experiment 1
18. Sample for Experiment 2
19. Counterbalanced Orders
20. WB and WISC PA Card Order for Help-Step 2 and Help-Step 3
21. Mean BD and PA Scores for Conditions, Sex, and Administrations
22. Analysis of Variance of BD and PA Scores for Conditions, Sex, Administrations, and Subjects
23. Mean BD and PA Scores for Examiners, Conditions, and Administrations
24. Analysis of Variance of BD and PA Scores for Conditions, Examiners, Administrations, and Subtests
25. Mean BD and PA Scores for Condition by Orders and Administrations
26. Analysis of Variance of BD and PA Scores for Condition by Orders, Administrations, and Subtests
27. Individual Mean Subtest Comparisons within each Condition by Order Administration
28. Mean Help Items for Experimental Condition for Orders, Subtests, and Examiners
29. Analysis of Variance of Help Items for Examiners, Orders, and Subtests
30. Mean Help Steps for Experimental Condition for Orders, Subtests, and Examiners
31. Analysis of Variance of Help Steps for Examiners, Orders, and Subtests
32. Mean Ages in Months for Sex and Condition by Orders
33. Analysis of Variance of Age for Condition by Orders and Sex
34. Mean SCAT Scores for Conditions, Examiners, and SCAT Scales
35. Analysis of Variance of SCAT Scores for Conditions, Examiners, and SCAT Scales
36. Mean SCAT Scores for Conditions, Sex, and SCAT Scales
37. Analysis of Variance of SCAT Scores for Conditions, Sex, and SCAT Scales
38. Intercorrelation Matrix for Variables Common to All Subjects in Experiment 2
39. Intercorrelation Matrix for Variables Common to All Experimental Subjects in Experiment 2
40. Intercorrelation Matrix for Variables Common to All Control Group Subjects in Experiment 2
41. Correlations of Other Subject Areas with Variables Common to All Groups in Experiment 2
42. Correlations of Other Subject Areas with Variables Common to All Experimental Subjects in Experiment 2
43. Correlations of Other Subject Areas with Variables Common to All Control Subjects in Experiment 2
44. Raw Data of Experiment 1
45. Raw Data of Experiment 2

List of Figures

Figure
1. Arrangements of blocks for each help-step condition for nine block designs
2. Arrangement of blocks for help-step 3. The X in the diagram indicates the blocks arranged for the subject.



1. Introduction 
Problem 

The psychological evaluation of a child's abili- 
ties is an extremely important task, especially when 
the child has learning or emotional difficulties. 
Problem children are usually evaluated by psycholog- 
ical assessment techniques which are individually 
administered. Group tests are less useful because, 
for example, it is more difficult to evaluate 
whether the child is trying his best, or whether 
directions are clear. Problem children frequently 
do not work at their maximum capacity, and group 
testing procedures do not provide methods for con- 
trolling or enhancing motivational level. 

Intellectual assessment, usually a part of the 
evaluation procedure, is performed by using one of 
a number of available individually administered 
intelligence tests such as the Stanford-Binet (S-B) 
or Wechsler Intelligence Scale for Children (WISC).
These tests require strict adherence to standardized 
procedures in order to ensure reliable and valid 
results. However, in many cases the child's 
physical, psychological, or cultural handicaps may 
impede the examiner's ability to administer the tests 
according to strict administrative procedures.

Children with emotional blocks, bilingualism, 
and physical difficulties such as cerebral palsy, 
deafness, blindness, or organic brain damage, 
usually perform less adequately on many tests.
Their lowered performance may be due in part to
their inability to comply with the task demands, 
rather than to a lack of knowledge per se. Thus, 
some children may be able to solve a problem if 
given more time, more explicit instructions, more 
trials, more help in understanding the task demands, 
or if they are permitted to answer the problem using 
more efficient sensory modalities. Little is known, 
however, about how emotional and physical difficul- 
ties affect performance on various test items, or 
about the effects of alterations in administrative 
test procedures on test performance. 






While it is important to study emotionally disturbed
and physically handicapped children, the performance of
normal children under procedures designed to investigate
the effects of modifying standard procedures should be
studied prior to that of special groups. Data derived
from normal children are then available for comparison
purposes.

The problem of the present investigation is to
determine the effects of alterations in test procedure
upon the original and repeated test performance of
normal adolescents. Specifically, the effects of
graduated help steps on two Wechsler subtests,
Picture Arrangement (PA) and Block Design (BD), are
investigated. The results of the investigation provide
data concerning help steps and test performance and
school grades; [illegible] differences in [illegible]
the investigation.

Background and Review of Related Research 

Masling (1960), in reviewing the situational and
interpersonal variables in projective testing, concluded
that such variables significantly affect test
results. Because similar variables operate in the area
of intelligence testing, it is important to evaluate
the extent to which they also affect intelligence test
results. [illegible] by test authors and writers
[illegible] following the [illegible] departures
from standard procedures, situational variables,
examiner variables, [illegible] not reviewed;
rather, the focus is on experiments employing one
subject at a time.









Departures from Standard Procedures 

Numerous writers (e.g., Cronbach, 1960; Freeman,
1962; Terman & Merrill, 1960; Wechsler, 1949) emphasize
the importance of following standard procedures
in administering individual intelligence tests.
According to Terman and Merrill (1960), "The discipline
of the laboratory has furnished the training
ground for instilling respect for standard procedures
/p. 47/." Cronbach (1960), in discussing
adherence to standard instructions, notes: "Any
departure from standard administrative practice
changes the meaning of scores /p. 185/." Wechsler
(1949) indicates that instructions and questions
must be read exactly as written in the test manual.
However, research on the effects of departing from
standard procedures is scant and the results are only
suggestive.

In requiring standard test procedures the test 
authors do not take into account subject variations 
or the possibility that a more accurate estimate of 
intellectual ability can be obtained, on some occa- 
sions, by "violation" of standard procedures. The 
examiner who deems it desirable to go beyond the 
standard test instructions in order to assess present 
or potential intellectual ability has usually read 
that such procedures may interfere with or affect 
the final test results. Freeman (1962) recognized 
that extratesting procedures are desirable when the 
examiner wishes to evaluate additional facets of the 
subject's abilities. However, he pointed out that 
such procedures should be attempted after the formal 
testing has been completed in order to maintain 
standardized procedures. 

Modifications in test procedure have been suggested,
especially when evaluating exceptional
subjects. A rationale for test modifications is
offered by Schonell (1956): When a subject's responses
are "adversely affected because of his physical
or sensory handicap, it seems reasonable to modify
the administration and/or scoring of the test, if no
other test is suitable for the particular individual
/p. 40/." In another article (Schonell, 1958) she
writes: "while all precautions should be taken to
adhere as closely as possible to test instructions,
occasions arise with some badly handicapped children
when a pedantic adherence to the instructions will
produce a result not only unfair to the individual
but quite incorrect and misleading /p. 137/." Wells
and Ruesch (1945) suggested that when administering
the Wechsler-Bellevue Intelligence Scale (WB)
"phraseology may be modified so long as essential
content is unchanged /p. 143/." It is permissible,
they suggest, to allow subjects who are hard of hearing
or who are very bright to see the problems.
Kessler (1966) recognized that the testing of blind
and deaf subjects poses a particular problem: "Standard
intelligence tests have to be modified to allow
for the lack of sight or hearing, and it is questionable
how far they can be changed and still be
comparable to conventional test results /p. 344/."

Newland (1963) suggested that a number of alternatives
are available to the examiner in making test
adaptations: "the examiner may read the standardized
test items to blind subjects, may allow a child to
use a typewriter in giving his responses if he has a
major speech or handwriting problem, may observe the
eye movements of the subject as he identifies parts
of a test item (where other children might write or
point with their fingers in responding), might start
with motor items rather than with verbal items in
the case of a child whose problem involves the communication
area, or might even rearrange some Binet
items into WISC /Wechsler Intelligence Scale for
Children/ form if research warranted taking such
liberties with the material /p. 69/."

Eisenson (1954) also recommended that on some 
occasions the examiner should not stay within the 
confines of the standard administrative procedures.
He suggested, for example, that standardized tests
administered to aphasic patients, "should be used to 
aid in formulating clinical judgments rather than as 
a means of trying to get a quantitative index . . . .
Modifications in administering the tests and of
evaluating the responses are usually necessary, or 
at least desirable, in order to elicit the clearest 
picture of the patient's intellectual functioning. 
Time limits may be ignored and roundabout defini- 
tions accepted /p. 4 [/." Such modifications, he 
recognized, preclude the use of test norms. 



Multiple-choice administration of the Picture
Vocabulary test of the Stanford-Binet Intelligence
Scale (S-B) is criticized by Burgemeister (1962).
By suggesting things to the subject "the test item
is admittedly less difficult for not requiring
recognition and recall elements of the Stanford
presentation. Credit thus given is, of course, distorting
scores in favor of the cerebral-palsied
patient /p. 117/."

Rephrasing WISC questions has been recommended 
recently. Eisenman and McBride (1964) suggested that 
some rural subjects may be penalized if the wording 
of the "balls" Comprehension item is used. Coyle 
(1965) suggested that the COD Information item be 
rephrased, in some cases, in order to avoid loss of 
rapport and an underestimation of the subject's 
potential.

Burgemeister (1962), Coyle (1965), Eisenman and
McBride (1964), Eisenson (1954), Kessler (1966),
Newland (1963), and Wells and Ruesch (1945) offered
no data to indicate how modifications may affect the 
reliability or validity of the test results. These 
writers recognized that modifications may preclude 
the use of test norms, but data were not presented 
which indicate how modifications affect test norms. 

Many other writers, too (e.g., Allen, 1959;
Katz, 1956; Michael-Smith, 1955; Portenier, 1942;
Strother, 1945), have been concerned with the problems
encountered in evaluating handicapped children by
traditional assessment devices, and they advocated
the use of test modifications. As can be seen from
Newland's (1963) comments above, the examiner must
be very resourceful in devising methods for modifying
standard procedures.

In a survey designed to determine the tests 
used for the intellectual evaluation of normal and 
handicapped children, Braen and Masling (1959) found 
that modifications of standardized intelligence tests 
(e.g., S-B and WISC) often occur in the assessment 
procedure; the specific modifications, however, were 
not reported by the respondents. The use of modifi- 
cations thus implies that many individually 
administered intelligence tests cannot be administered 
in a standardized manner. Braen and Masling (1959)
also pointed out that modifications do not permit
the use of the standardized norms, because the
modified test produces a different form of the test
which does not have known reliability and validity.

The studies available in the area of departures 
from standard procedures are not conclusive. They 
have dealt with limited segments of WB subtests, 
have evaluated different orders of administration of 
S-B or Wechsler Adult Intelligence Scale (WAIS)
items, or have studied procedural changes. Individual
WB subtests were the focus in three of the
five studies reporting significant results. 

Guertin (1954) reported that college subjects 
performed better on the WB Arithmetic when the more 
difficult items were administered first than when 
the conventional order was followed. Evaluating 
three different placements of the WB Digit Span, 
Klugman (1948) reported that psychoneurotic subjects 
obtained the highest scores when the subtest 
appeared in the middle of the test, next highest 
scores when the subtest appeared at the beginning, 
and the lowest scores when the subtest appeared at
the end. Hutton (1964) administered the S-B (L-M)
and WISC Digit Span to 60 subjects, the majority of
whom were retarded. Significantly higher scores were
obtained on the S-B digit repetition items. Because
the Full Scale WISC was not administered, the results
cannot be accepted as indicating differences between
the S-B and WISC per se.

Hutt (1947), by alternating hard and easy items
(adaptive method) on the S-B (L), was able to produce
a significant gain in IQ scores with poorly
adjusted subjects ranging from kindergarten to ninth
grade. The adaptive method, however, did not produce
a significant difference for a well-adjusted
group. Paralleling Hutt's (1947) findings are those
of Greenwood and Taylor (1965). An adaptive method
with the WAIS was used: Each subtest was begun with
an item below the subject's anticipated mental level,
and easy and hard items were alternated by using
scale items or a pool of similarly easy items. The
adaptive method resulted in significantly increased
retest scores for subjects between 65 and 75 years
of age, but not for above-average college subjects.
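
As a rough illustration only, the alternation of easy and hard items described above can be sketched as an ordering rule. The function name, the item labels, and the strict one-for-one alternation are assumptions made for this sketch; neither Hutt (1947) nor Greenwood and Taylor (1965) is quoted here as specifying an exact sequence.

```python
from itertools import zip_longest


def alternate_items(easy_items, hard_items):
    """Return one sequence that starts with an easy item and then
    alternates easy and hard items until both lists are used up.

    Hypothetical helper: the one-for-one alternation is an assumption,
    not a rule taken from any test manual.
    """
    ordered = []
    for easy, hard in zip_longest(easy_items, hard_items):
        if easy is not None:
            ordered.append(easy)
        if hard is not None:
            ordered.append(hard)
    return ordered


# Hypothetical item labels, used only to show the resulting order.
sequence = alternate_items(["easy-1", "easy-2", "easy-3"], ["hard-1", "hard-2"])
print(sequence)
# ['easy-1', 'hard-1', 'easy-2', 'hard-2', 'easy-3']
```

Starting the sequence with an easy item mirrors the description that each subtest began with an item below the subject's anticipated mental level.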






Serial administration of the S-B (grouping 
items of the same content together) has been evalu- 
ated in two studies. Frandsen, McCullough, and 
Stone (1950) administered the S-B (forms L & M) 
under conventional and serial orders to subjects 
from 5 to 18 ; years of age, and no significant IQ 
differences were found. Spache (1942), while not
experimentally manipulating any variables, computed 
two S-B (L) IQs, one standard and one based upon 
items that could be arranged serially. Significant 
differences were not found in test scores for a 
group of gifted subjects between two and nine years
of age.

Procedural changes have not been found signi- 
ficant in five studies. Allowing elderly subjects 
(over 60) unlimited time on the WAIS made very 
little difference in their test scores (Doppelt & 
Wallace, 1955). Affleck and Frederickson (1966)
found that scoring the WAIS Picture Arrangement on 
the basis of four consecutive failures failed to 
make a significant difference for a group of 
671 subjects. The IQ was affected by the new scor- 
ing rule in only 2.8% of the cases, and only in 
three cases did the Full Scale IQ change by as much 
as two points. Schonell (1956) used three different 
methods to compute S-B IQs and found that, for a 
group of 354 cerebral palsied children, 74% obtained 
identical IQs. The first was the standard method of 
computing the IQ (tested IQ); the second, a modified 
IQ, credited the subject with passing items which 
the examiner judged the subject would have passed if 
not for the subject’s disability; and the third, an 
estimated IQ, established an IQ based upon the 
examiner's estimate of the subject's overall ability. 
Schonell concluded that these computational
modifications do not significantly affect the overall test
results.

Mogel and Satz (1963) studied an abbreviated 
administration of the WAIS and concluded that dis- 
ruption in the continuity of item difficulty has a 
negligible effect on test results; 60 neuropsychi- 
atric patients served as subjects in a test-retest 
design. Norris, Hottel, and Brooks (1960) found 
that individual and group administration of the 
Peabody Picture Vocabulary Test to 60 fifth grade
subjects of average intelligence resulted in similar
mean scores. Practice increased the mean IQ by only 
one point. 

Studies concerned with departures from standard 
procedures do not appear to strongly confirm the 
assumption that modifying standard procedures 
seriously affects the overall test results. Of the 
12 studies reviewed, 5 reported significant results, 
while 7 reported nonsignificant results. Modifica- 
tions in procedures, at times, affect only certain 
subject populations. Children and college age sub- 
jects are usually not affected (six of seven studies 
employing these groups reported no significant 
effects), while specialized groups composed of either
elderly, disturbed, or retarded subjects tend to be affected
by the departures (four of seven studies employing
these groups reported significant effects). In light
of the limited number of studies, the rather minute 
procedural changes often studied, and the fact that 
some studies demonstrate a significant effect result- 
ing from departures from standard procedures, the 
examiner should follow standard procedures. However, 
Littell's (1960) conclusion from his review of the 
WISC is apropos: "The possible effects of differences
in the examiner's techniques of administration
is another problem area which has not received the
attention it merits . . . /p. 146/."
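
The tallies in the preceding paragraph can be restated as percentages. The short sketch below uses only the counts given in the text; the variable and function names are illustrative assumptions. The two group tallies sum to 14 although 12 studies were reviewed, presumably because some studies employed more than one kind of group.

```python
# Counts as stated in the review paragraph above.
studies_total = 12
studies_significant = 5
studies_nonsignificant = 7

# Group-level tallies: (groups with a significant effect, groups studied).
group_tallies = {
    "children and college-age subjects": (1, 7),  # six of seven: no effect
    "elderly, disturbed, or retarded subjects": (4, 7),
}


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


print("Studies with significant results:",
      percent(studies_significant, studies_total), "%")
for label, (significant, studied) in group_tallies.items():
    print(label, "-", percent(significant, studied), "% affected")
```

On these counts, well under half of the reviewed studies found a significant effect, and the effect is concentrated in the specialized groups, which is the pattern the paragraph describes in words.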

Situational Variables 

A variety of attempts have been made to alter 
the testing conditions systematically. They have 
ranged from varying incentive and ego involvement 
to using money, praise, and other reinforcement 
procedures. This section reviews 20 studies; signi- 
ficant findings appeared in 5, nonsignificant in 12, 
and both significant and nonsignificant in 3. 

Subjects between approximately 9 and 14 years
of age were studied in four of the five studies with
significant findings. Failure, frustration, or discouragement
appeared as a variable in all five
studies with significant findings. Lantz (1945)
found that 9-year-old males, when examined with the
S-B (L & M), had lower scores after a failure experience.
Success experience, on the other hand, did
not significantly increase their scores.
Discouragement significantly lowered the S-B (L & M)
scores of eighth grade subjects (Gordon & Durea,
1948) and the S-B (L) scores of above-average fifth
and sixth grade subjects (Pierstorff, 1951). Solkoff
(1964) evaluated the effects of three degrees of
frustration on WISC Coding performance of 36 brain-injured,
9-year-old male subjects. High frustration
(interrupting a marble game task and withholding of
a promised reward) significantly impaired performance
compared to low frustration or control conditions.
Schizophrenics exposed to a failure experience had
lower scores than a control group of schizophrenics
on a repeated administration of a test similar to the
WB Similarities (Webb, 1955).



Discouragement, anxiety, or distraction was
evaluated in eight of the studies with nonsignificant
results, and college students were employed in seven
of these. A positive administration, characterized
by an approving and interested manner, and a negative
administration, characterized by a rejecting and disinterested
manner, did not significantly affect
college subjects' performance on a short form of the
WAIS (Murdy, 1962). College subjects' Digit Symbol
performance was similar under success, failure, and
neutral conditions (Mandler & Sarason, 1952). In
Walker et al.'s (1965) study, three different failure
conditions resulted in similar WAIS Object Assembly
performances. Failure condition scores were also not
significantly different from control condition
scores.2 Truax and Martin (1957) found that WB
Arithmetic scores of college females were similar
under mild and severe threat conditions; for the
total group, however, performance was better for
subjects tested after a 24-hour period than for those
tested immediately after the threat was induced.
Anxiety and/or distraction failed to affect Digit
Span performance in three different studies employing
college subjects or newly admitted psychiatric
patients (Craddick & Grossman, 1962; Guertin, 1959;
Walker & Spence, 1964). Three different incentive
conditions (verbal praise, verbal reproof, and candy)
employed by Tiber and Kennedy (1964) had no significant
effect on second and third grade white and Negro
subjects' S-B (L-M) scores.



2 R. E. Walker, personal communication, May 1966.



The remaining four studies with nonsignificant 
results used some form of ego involvement; three used 
college subjects. Achievement-oriented and neutral 
instructions resulted in similar scores on the four 
WAIS subtests (Comprehension, Vocabulary, Digit 
Symbol, Block Design) administered to 96 college
subjects by Sarason and Minard (1962). Guertin (1954)
found that when college subjects received instruc- 
tions designed to minimize resignation attitudes 
their WB Arithmetic scores were similar to scores 
obtained under standard conditions. Nichols (1959) 
employed 11 examiners and evaluated subjects' per- 
formance under two conditions of ego involvement and 
two conditions of success. No significant effects 
on WB scores were found for any of the variables with 
superior college subjects. Klugman (1944) found that 
money and praise incentives had similar effects on 
S-B (L & M) scores of subjects between the ages of 7 
and 14. 

Both significant and nonsignificant findings 
have been reported in three studies. Gallaher (1964) 
administered the WB (II) Digit Symbol to female 
volunteer college subjects. A month later a diffi- 
cult vocabulary test was administered to experimental 
groups concomitantly with either positive or negative 
examiner remarks, or with an extended series of diffi- 
cult tests at which subjects failed. The WAIS Digit 
Symbol was then administered. While change scores 
were not affected by the examiner's remarks, the 
three experimental groups performed significantly 
better (higher change scores) than the control group 
on the second Digit Symbol. Griffiths (1958) 
reported that experimentally induced anxiety impaired 
WB Digit Span and Information scores. However, the 
college subjects' scores on Arithmetic, Object 
Assembly, and Digit Symbol were not adversely 
affected. Moldawsky and Moldawsky (1952) equated 
college subjects for verbal intelligence and then 
administered in a counterbalanced order the WB Digit 
Span and Vocabulary under anxiety and neutral condi- 
tions. One significant effect was found: Vocabulary- 
Digit Span order under the anxiety condition produced 
lower Digit Span scores, while Vocabulary scores were 
not affected. 

It has often been suggested that anxiety disrupts 
immediate memory. Some of the studies reviewed in 
this and other sections specifically investigated
memory ability in relation to the various experimental
conditions. Seven studies reported significant
findings — a decrement in memory functioning — as a 
result of such factors as adjustment, anxiety, dis- 
couragement, failure, location, method of 
presentation, and rapport (Exner, 1966; Gordon &
Durea, 1948; Griffiths, 1958; Hutton, 1964; Klugman,
1948; Pierstorff, 1951; Young, 1959). Nonsignificant 
findings have been reported in six investigations 
which studied anxiety, distraction, failure, time, 
and the examiner's race (Craddick & Grossman, 1962; 
Doppelt & Wallace, 1955; Forrester & Klaus, 1964;
Guertin, 1959; Lantz, 1945; Walker & Spence, 1964),
and one reported both significant and nonsignificant
results (Moldawsky & Moldawsky, 1952). Other studies
have also incorporated digit span items, but the
vulnerability of these items to the experimental con- 
ditions cannot be evaluated because specific items 
were not reported. The evidence, however, suggests 
that immediate memory, as measured by digit-span 
performance, is susceptible to procedural, situational,
and interpersonal factors.

Generalizations concerning the effects of situ- 
ational variables on test performance must be 
tentative. Discouragement is likely to affect the 
performance of children between 9 and 14 years of age, 
but not of college subjects. Praise has never been 
reported to produce significantly better performance 
than control or other experimental conditions. Little 
attention has been devoted to the effects of situational
variables on emotionally disturbed groups.
The results suggest that children are especially
vulnerable to discouragement.

Examiner Variables 

The examiner has often been cautioned to prevent 
his test administration from being influenced by his 
impression of the subject — the "halo" effect. Scor- 
ing, probing, and inquiring may be affected by the 
examiner's impression of whether the subject may be 
able to answer the questions. Burgemeister (1962) 
illustrates the "halo" effect in the examination of 
cerebral palsied subjects: "Motivated by a feeling
of sympathy often reinforced by seeing the physical
energy expended by so many palsied children in following
instructions, the examiner easily believes his
hope, i.e., that the child knows more than he can
express, and hence overestimates the child's ability
[p. 117]."






McFadden (1931) observed in an experiment employing
the S-B that examiners may differ in giving help
and in leniency in scoring: "This makes comparisons
of different examiners liable to error when the subtests
are considered [pp. 62, 64]." Goodenough (1940)
also discussed the possibility of systematic errors in
test administration and in test scoring. She noted 
that no experiment had been reported, at the time she 
wrote her article, which evaluated how the examiner's
knowledge of the subject's scores obtained on previous 
examinations may affect the examiner's testing proce- 
dures. Even now, little information is available 
concerning the very important point raised by 
Goodenough. Ekren (1962) has, however, evaluated the 
effect of the examiner's knowledge of the subject's
ability upon test scores. Eight undergraduate male
examiners were led to believe that half of their subjects
were earning high grades in school, and that
the other half were earning lower grades. Because 
similar WAIS Block Design scores were obtained for 
the two groups, Ekren (1962) concluded that the knowl- 
edge variable had no significant effect. 

Turning to studies evaluating the examiner- 
subject relationship, Sacks (1952) administered the 
S-B (L & M) to 3-year-old subjects. On a repeated 
test administration, a good relationship between 
the examiner and the subject produced a significantly 
greater gain than a poor relationship. However, while 
not significantly different from the control group, 
the poor relationship group also obtained higher 
scores on the repeated test administration. Exner 
(1966) studied the effect of examiner rigidity in 33
pairs of subjects from 7 to 14 years of age. Sub- 
jects in each pair were initially matched on age, sex, 
and S-B IQ. The WISC was administered to 25 pairs in
the conventional order and to 8 pairs in a reversed 
order. Compared to rapport conditions, rigid condi- 
tions resulted in lower Verbal and Performance IQs 
under the conventional order of administration and a 
lower Performance IQ under the reversed order of 
administration. The effect of the rigid examiner
condition was most noticeable on the subtests administered
early in both conventional and reversed order
administrations.

Hardis (1955) administered the WB (I and II) to 
40 male adolescents under rapport and standard
conditions in a test-retest design. Verbal, Performance,
and Full Scale IQs, scatter patterns, and
change scores did not differ between the two conditions.

Hata, Tsudzuki, Kuze, and Emi (1958) evaluated
test-retest scores as a function of the examiner-subject
relationship and the subject's personality.
The subjects were assigned to either a preferred or
nonpreferred examiner. A group IQ test was first
administered by the classroom teacher. Nine examiners
then administered an individual IQ test to 147 12-year-old
subjects. Results indicated that subjects
examined by a preferred examiner received improved
scores on the individual test, as compared to subjects
examined by a nonpreferred examiner. The
subjects with a favorable or neutral attitude toward
people also had improved scores with a preferred
examiner, while those subjects with a less favorable
attitude toward people did not significantly improve
with a preferred examiner.

Schizophrenic subjects have also been studied.
An authoritarian and an understanding examiner administered
the WAIS Similarities and Block Design and a
number of other measures to process and reactive
schizophrenics and nonschizophrenic Veterans Administration
males (Gancherov, 1963). Process
schizophrenics were the only group having significantly
lower scores when tested by the understanding
examiner. Schupper (1955) investigated the effects 
of an accepting and rejecting relationship on the 
performance of schizophrenic males. The subjects 
were placed in one of two groups depending upon the 
age at which they were first hospitalized. Both 
groups had lower WB Similarities scores under the 
rejecting condition. Understanding examiners lower 
the scores of process schizophrenics, while rejecting 
examiners lower the scores of schizophrenics not dif- 
ferentiated as to process or reactive types. It is 
likely, however, that some subjects in Schupper's
(1955) study were of the process type. These appar- 
ently contradictory results are difficult to explain. 

Young (1959) investigated personality patterns 
of both subjects and examiners. Using the Digit Span,
he reported that "Subjects with 'poorly adjusted'
experimenters performed better than subjects with
'well adjusted' experimenters, male subjects did
better than female subjects, and digits forward were
easier than digits backward [p. 315]." These examiners
were college students from introductory
psychology classes and should not be equated with 
examiners having graduate training or professional 
experience . 

Familiarity with the examiner has been studied 
in two investigations. Marine (1929) reported that 
subjects between 3 years, 8 months and 8 years,
3 months who were familiar with the examiner did not
perform in a significantly different manner on the
1916 S-B than those not familiar with the examiner.
In contrast, mentally retarded subjects well
acquainted or slightly acquainted with the examiner
obtained higher scores on intelligence tests of the
S-B and WB types than those subjects tested by a
strange examiner (Tsudzuki et al., 1956).

The examiner's experience has been evaluated 
in five investigations, and four reported no signi- 
ficant differences between trained and less trained 
examiners. Jordan (1932) had 76 second- and third- 
year undergraduate examiners administer the 1916 S-B 
in a test-retest design, and a reliability coefficient 
of .84 was obtained. Jordan concluded, by comparing 
his results to published data, that inexperienced 
and experienced examiners obtain equally reliable 
IQs. Curr and Gourlay (1956) studied 8 trained and 
10 untrained examiners who administered the S-B (L) 
to 8- and 9-year-old subjects in a test-retest 
design. The examiner's training was not found to be 
a significant variable. Plumb and Charles (1955),
studying the WB, and Schwartz (1966), studying the
WAIS, reported that both experienced and inexperienced
examiners have essentially similar scoring
disagreement patterns on Comprehension items.
Masling (1959) did not find a significant relationship
between the number of tests examiners had
previously administered and (a) leniency in scoring
and (b) number of reinforcing comments. In contrast
to the nonsignificant findings reported above,
LaCrosse (1964) found that test-retest scores were
significantly different as a function of the number
of tests the examiners had previously administered.




The examiner's race has been considered an impor- 
tant variable in testing for many years. Strong 
(1913), in a study employing the Binet-Simon Measuring 
Scale of Intelligence, noted that it was possible that 
the Negro subjects might have obtained different 
results with an examiner of their own race. Pressey 
and Teter (1919) questioned "whether tests given by
white examiners to colored pupils can give reliable
data for a comparison of the races [p. 278]." Garth
(1922-23) felt that white subjects might have an 
advantage over Indian and Negro subjects when these 
groups are tested by a white examiner. Blackwood (1927) 
wrote that more research was needed to evaluate the 
effects of rapport and motivation in testing, especially
when subjects and examiners are of different races. 
Klineberg (1935, 1944) suggested that poor rapport may 
exist between Negro subjects and white examiners. He 
indicated that testing Negro subjects in the South 
presents a special problem because the white examiner 
may "face an attitude of fear and suspicion which is
certain to interfere with the performance of an intellectual
task [1935, p. 156]."



The earliest reported investigation concerning 
the effect of examiner's race on intelligence test 
results was conducted by Canady (1936). He used the
1916 S-B and employed one Negro and 20 white exam- 
iners. Sattler (1966b), by reanalyzing Canady's 
data, showed that examiner's race interacts with the 
subject's race. On the first test administration 
subjects obtained higher IQs with examiners of their 
own race, while on the repeated examination subjects 
obtained higher IQs with examiners of the opposite 
race. LaCrosse (1964) found that a white examiner 
obtained significantly lower S-B (L-M) retest scores 
when testing Negro subjects who had been previously 
examined by a Negro examiner. The same white exam- 
iner, on the other hand, obtained significantly 
higher retest scores with white subjects previously 
tested by white examiners. Forrester and Klaus 
(1964) reported that on the S-B (L-M) 24 Negro kinder- 
garten subjects achieved higher IQs when examined by 
a female Negro examiner than when they were examined 
by a female white examiner. However, the interaction 
between subject's race and test administration was 
not significant. 






Studies have evaluated the effects of white examiners
on the performance of either Negro subjects or
of both Negro and white subjects. Pasamanick and
Knobloch (1955) concluded that racial awareness
SET v.rb*i r. T n.iv.n=.» 
white examiner on the Gesell Developmental Examination.

better in a money incentive condition tha^i hand< 

perSrme! similarly in the two ||^dy°U964) 

In contrast to Klugman u? ' react similarly to 

as: ss?ss ss^FHsS: 1 " 

examiners had similar test-retest change scores. 

The examiner variable has been evaluated without

22*32 

obtained by examiners in the Harvard 
scores obtained y ig65) reported that out of 13 

i MSS fMS M l; . 

examiners ^^^^^Slenbreported ” 1 Nichols (1959) 
Lebovitz , 1966) have been p examiners 

reported no examiner ( ^| 2 f rep orted no 

Sn^dffle^ 

Apgar A! ( 1958 ? n investigated the changes between S-B (L) 



3 S. F. Klugman, personal communication, December
1966.

4 Race of examiner inferred. 

5 F. F. Schachter, personal communication, 
November 1966. 






